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The present invention concerns a method for the detection of the reception of 
a data frame in an input signal, said data frame comprising periodically 
repeated symbols at the beginning. It further concerns a frame detector, a 
synchronizing method, and a synchronizer device. 

BACKGROUND OF THE INVENTION 

The IEEE 802.1 1a standard makes use of the Orthogonal Frequency Division 
Multiplex (OFDM) transmission scheme. The main feature of the OFDM 
scheme is that the information stream is not transmitted into a single carrier, 
but is divided into several sub-carriers, each transmitting at a much lower rate. 
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Furthermore, all these sub-carriers are orthogonal, i.e., they overlap their 
spectra, but do not cause mutual interference. 

The fact that the different sub-carriers overlap their spectra makes one of the 
main operations at the receiver especially difficult: synchronization. In a 
receiver, the synchronizer is the block responsible for detecting the incoming 
frame and for estimating and correcting possible frequency offsets. The 
synchronization process is also responsible for providing a reference channel 
estimation to the channel estimation block. It further decides the starting point 
from which on the different OFDM symbols will be fed into the FFT block. The 
correct reception of OFDM signals is very sensitive to the synchronizer 
performance. 

In the OFDM transmissions considered here, the information is not transmitted 
continuously, but in bursts. Each burst contains a single frame, which is a 
compound of different OFDM symbols. 

In an OFDM transmission each data packet consists of a preamble and a data 
carrying part. The preamble symbols are placed at the very beginning of each 
frame during transmission. The preamble consists of 10 "shorf identical 
known OFDM symbols concatenated with 2 "long" identical and known OFDM 
symbols. The preamble symbols have a very specific periodic structure in the 
standard IEEE 802.11a to simplify synchronization. The data carrying part 
consists of a variable number of OFDM symbols, where each OFDM symbol 
contains useful information plus some known pilot sub-carriers, which are 
typically used for phase tracking. 

Synchronization comprises the following operations: Frame detection, carrier 
frequency offset determination, symbol timing estimation, extraction of the 
reference channel, and data reordering. The synchronization process is data- 
aided, i.e., based on the digital processing of the preamble symbols. 

During reception, the synchronizer has to peer the channel in order to detect 
an incoming packet or frame. Frame detection in a receiver is an especially 
difficult task because there is no time raster that governs the transmission of 
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the frames. In other words, the receiver does not know when to expect an 
incoming frame. The object of frame detection is to determine the symbol 
boundary so that correct samples for a frame can be taken. A major problem 
in the use of OFDM is therefore the determination of the time instant, at which 
the receiver starts sampling a new frame. A mismatch in the determination of 
this parameter would introduce a phase error causing intercarrier interference 

(ICI). 

A method for frame detection in an OFDM signal is described by Chiu et al. 
(Yun, Chiu, Dejan Markovic, Haiyun Tang, Ning Zhang, OFDM receiver 
Design, published at URL: http://bwrc.eecs.berkeley.edu/PeopIe/ 
Grad_Students/dejan/ee225c/ofdm.pdf). It uses the first ten short symbols 
transmitted at the beginning of an OFDM frame, also referred to as the "short 
traning sequence". The waveform of the short symbols is known and stored in 
the receiver decive. The receiver performs a correlation of the sampled signal 
with the stored waveform. It further performs and autocorrelation of the 
sampled signal with the delay of one short symbol. While the autocorrelation 
creates a first signal with a plateau over a time span during which short 
symbols are received, the correlation with the known waveform creates a 
second signal exhibiting peaks. The last peak of the second signal occurring 
during a plateau in the first signal is chosen as a time reference to start frame 
detection. 

However, in the solution proposed by Chiu et al. the frame detection does not 
work if a fading channel is affecting the signal. 

Another method is described in Schwoerer, L; Wirz, H.; "VLSI Implementation 
of IEEE 802.11a Physical Layer". Proceedings of the 6th International OFDM- 
Workshop (InOWo) 2001, pp. 28.1-28.4, Sept. 2001, Hamburg, Germany. To 
detect the periodicity of the short training sequence, a sum of the absolute 
amount of three delayed autocorrelation terms over three different, 
overlapping time spans is calculated and then compared to a threshold value 
that is scaled with the signal power over the same time spans. 
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The solution proposed by Schwoerer et al. correspondingly makes use of 
three autocorrelators for the frame detection. This implies a rather large silicon 
area necessary for implementation of their solution and a high power 
consumption. 

Summary of the Invention 

It is an object of the invention to provide a simple and reliable method and 
device for detection of the reception of a data frame in an input signal. 

It is a further object of the invention to provide a synchronizing method and a 
synchronizer with a simple and reliable frame detector. 

These objects are solved with a frame detection method according to claim 1 , 
a frame detector according to claim 14, a synchronizing method according to 
claim 1 7, and a synchronizer device according to claim 26. 

According to a first aspect the invention a method for detection of the 
reception of a data frame in an input signal (y OF F (n)) is provided. The method 
is based on the fact that the data frame comprises periodically repeated 
symbols at the beginning. The method of the first aspect of the invention 
comprises the steps of 

a) sampling said input signal (yoFF (n)) with a predetermined sampling 
rate 

b) generating a first signal (|J(k)| 2 ) that is dependent on an 
autocorrelation of said input signal with a delayed copy of said 
input signal, and 

c) detecting a plateau in said first signal (I J(k)| 2 ) 

d) generating an output signal that is indicative of detecting said 
plateau. 
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In the method of first aspect of the invention, step c), i.e., detecting a plateau, 
comprises a step of generating a differentiator signal (J dl ff (k)), which is 
dependent on the difference of a first sample of the first signal and a second 
sample of the first signal that was taken a predetermined number of sampling 
periods earlier. Further, step c) comprises a step of detecting an absolute 
maximum of said differentiator signal (J 6m (k)) within a predetermined range of 
sampling periods. 

The method of the invention makes use of the fact that a plateau in an 
autocorrelation signal, i.e., in the first signal, can be detected reliably by 
making use of a differentiation of the autocorrelation signal with a limited 
bandwidth. As is known, a plateau in the autocorrelation signal is indicative of 
the reception of a frame with periodically repeated symbols in the input signal. 

The invention is based on the finding that a real differentiation, that is, a 
differentiation with a limited bandwidth, will transform an input signal into an 
output signal that exhibits an absolute maximum at the onset of a plateau of 
the input signal. In the method of the invention, the input to the real 
differentiation is the autocorrelation signal, and the output signal is the 
differentiator signal. The bandwidth of the differentiation can be limited for 
instance by calculating the difference of the respective two autocorrelation 
signals involved with a predetermined delay of a number of sampling periods. 

A maximum of the differentiator signal can be detected with precise timing in a 
simple and reliable manner with an appropriate peak detection method. 

An ideal differentiator, that is, a differentiator with an unlimited bandwidth, 
would show a discontinuity at the onset of a plateau, assuming that no noise 
was present in the signal. 

It is noted that the term signal as used herein denotes, depending on the 
context, analog signals or digital signal, for instance binary representations of 
values of quantities, such as for instance a digital representation of a number 
indicating the current value of an autocorrelation of the input signal with its 
delayed copy. 
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The step of generating said differentiator signal (J d iff (k)) comprises for 
instance a step of delaying the first signal (I J(k)| 2 ) for the first predetermined 
number of sampling periods, and a step of generating a difference signal that 
is dependent on the difference between the first signal (I J(k)| 2 ) of a current 
sampling period and the delayed first signal (I J(k)l 2 ). 

In a preferred embodiment of the invention, the autocorrelation step (step b) 
comprises the steps of delaying the input signal by a third predetermined 
number (Nd) of sampling periods, transforming the input signal into a second 
signal that is dependent on the complex conjugate of the input signal, and 
generating a fourth signal that is dependent on the product of said second 
signal and of said delayed input signal. Preferably, an autocorrelation signal is 
generated that is dependent on a sum of a fourth predetermined number Navg 
of autocorrelation signals. 

To accomplish the calculation of the sum, a fourth signal can be saved for Navg 
sampling periods. The autocorrelation signal is created by adding said fourth 
signal of a current sampling period to said autocorrelation signal of a last 
previous sampling period and subtracting one fourth signal, that was saved 
Navg sampling periods earlier. Preferably then, the first signal is obtained as 
the product of the autocorrelation signal and its complex conjugate. 

In a further preferred embodiment, the step of detecting an absolute maximum 
of the differentiator signal within a second predetermined number of sampling 
periods comprises an instantaneous peak detection step and a step of 
detecting a falling slope in the differentiator signal. Preferably, these two steps 
are performed in parallel. Instantaneous peak detection refers to the detection 
of a relative peak in the differentiator signal. The detection of a relative peak at 
the same time a falling slope occurs is an indication of the occurrence of an 
absolute maximum in the signal within the second predetermined number of 
sampling periods. In this embodiment, the detection of an absolute maximum 
in the differentiator signal is implemented with a simple and reliable method. 



WO 2004/008706 



7- 



PCT/EP2003/007750 



Instantaneous peak detection involves, for instance, comparing the 
differentiator signal (J d iff (k)) of a current sampling period with the differentiator 
signal (J dlff (k)) of a last, i.e., a next previous sampling period, and a step of 
saving the differentiator signal (J d iff (k)) of the current sampling period to a 
register, given the condition that its value is larger than that of the differentiator 
signal (J drf f (k)) of the previous sampling period. The register therefore always 
contains the largest differentiator signal. 

Preferably, a step of incrementing a count index by one is performed, if the 
value of said differentiator signal (Jam (k)) of the current sampling period is 
equal or smaller than that of said differentiator signal (Jcwr (k)) saved in said 
register. A (third) signal can be generated indicative of the condition whether 
or not the count index has reached a predetermined value. The third signal 
indicates that a relative peak has been found within the sample range defined 
by the predetermined counter value, which can be the counter limit. The 
relative maximum is the value contained in the register. 

Detection of a falling slope in the differentiator signal (J dlff (k)) is preferably 
performed by a group peak detection method. That means, the differentiator 
signal is accumulatord in groups of six samples and the present group is 
compared with the previous one. If the present group is smaller than the 
previous one, the falling slope has started. In detail, an accumulation signal is 
generated that is dependent on the sum of the differentiator signal (Jam (k)) 
over a predetermined accumulation number of consecutive sampling periods. 
The current accumulation signal is compared with the last previous 
accumulation signal representing (without overlap) the accumulation number 
of consecutive earlier sampling periods, if the value of the current 
accumulation signal is smaller than the value of the earlier accumulation 
signal, a group peak detection signal is generated. In an alternative 
embodiment, the group peak detection is a binary signal that is for instance a 0" 
if the current accumulation signal is larger, and tt T if the current accumulation 
signal is smaller than the previous one. The group peak detection signal thus 
indicates that the differentiator signal is decreasing. 
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A maximum detection signal can be generated if the instantaneous peak 
detection signal indicates that the count index has reached the predetermined 
count value and the group peak detection signal indicates that the value of the 
current accumulation signal is smaller than the value of the earlier 
accumulation signal. Preferably, the maximum detection signal is the output 
signal. 

It is noted that with the peak detection algorithm described above the plateau 
will be detected with a small delay due to the instantaneous peak detection 
method. The delay corresponds to the predetermined count value used in the 
instantaneous peak detection. However, this delay has no influence on the 
result of the following steps in the synchronization. 

Preferably, the input signal is amplified such that the power of the amplified 
input signal is in a predetermined power range. The purpose of this step is to 
adapt the input power to a power level according to the conversion range of 
Analog-Digital converters. In order to avoid detecting a plateau in pure noise, 
the step of detecting a plateau in the first signal (|J(k)| 2 ) is preferably 
performed only if the first signal I J(k)l 2 exceeds a predetermined threshold 
value. 

The method of the invention is preferably used for detecting a data frame 
containing OFDM symbols. 

According to a second aspect of the invention, a frame detector is provided for 
detecting the reception of a data frame in an input signal (yon= (n)), said data 
frame comprising periodically repeated symbols at the beginning. The frame 
detector has 

a) a sampling unit adapted to sample said input signal (yoFF (n)) with a 
predetermined sampling rate 

b) an autocorrelation unit adapted to transform said input signal (y 0 i=F (n)) 
into a first signal (I J(k)| 2 ) that is dependent on an autocorrelation of said 
input signal with a delayed copy of said input signal, and 
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c) a plateau detector, adapted to detect a plateau in said first signal 
(iJ(k)l 2 ) 

d) an output unit adapted to generate an output signal that is indicative of 
detecting said plateau. 

The plateau detector is adapted to generate a differentiator signal (J<fiff (k)) f 
which is dependent on the difference of a first sample of said first signal and a 
second sample of said first signal a predetermined number of sampling 
periods earlier, to detect an absolute maximum of said differentiator signal (Jdsr 
(k)) within a predetermined range of sampling periods, and to provide a signal 
indicative of detecting said absolute maximum to said output unit. 

The frame detector of the second aspect of the invention implements the 
frame detection method of the first aspect of the invention. In what regards 
advantages reference is made to the above description of the method of the 
invention. 

A preferred embodiment of the frame detector of the second aspect of the 
invention implements the plateau detection method according to the preferred 
embodiment of the method of the invention described earlier. Preferably, said 
plateau detector of the frame detector comprises a peak detection unit with a 
first detection unit connected to the input port and comprising a first memory 
unit. The first detection unit is adapted to compare the input signal (Jdiff (k)) 
received through the input port with a first entry contained in the first memory 
unit, and to replace the first entry by the input signal if the value of the input 
signal (Jdiff (k)) is larger than the value of the first entry. The first detection 
unit implements the instantaneous peak detection method. 

In this embodiment, there is preferably a second detection unit connected to 
the input port and comprising a second memory unit. The second detection is 
adapted to generate an accumulation signal that is dependent on the sum of 
the differentiator signal (J d iff(k)) over a fourth predetermined number of 
consecutive sampling periods, to compare the current accumulation signal 
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with the last previous accumulation signal representing without overlap the 
fourth predetermined number of consecutive earlier sampling periods, and to 
generate a group peak detection signal indicative of whether or not the value 
of the current accumulation signal is smaller than the value of the earlier 
accumulation signal. 

Further preferred embodiments of the frame detector implement preferred 
embodiments of the frame detection method described earlier. From that 
description it is obvious to a person skilled in the art how to implement the 
preferred embodiments of the frame detector. 

It is noted that the peak detection method as well as the peak detection unit 
used in the present context for detecting a plateau in the autocorrelation signal 
can be used for peak detection in any signal. The peak detection method as 
well as the peak detection unit represent an independent invention. 

According to a third aspect of the invention, a synchronizing method is 
provided. The synchronizing method comprises a step of frame detection, 
which is carried out using the method of the first aspect of the invention or any 
of its embodiments described above or in the dependent claims. 

A carrier frequency offset, which is due to a frequency mismatch during RF 
downconversion, introduces an unknown phase factor to each signal sample. 
This unknown phase factor must be estimated and compensated for each 
sample before performing a Fast Fourier Transform (FFT) at the receiver. 
Otherwise the orthogonality between subcarriers is lost A loss of orthogonality 
is also creates intercarrier interference. 

A preferred embodiment of the synchronizing method of the third aspect of the 
invention therefore further comprises a step of estimating a relative frequency 
offset (f B ) in an input signal (yopKn)). wherein the estimating step comprises 
the steps of 

a) estimating a coarse frequency offset (P) 
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b) estimating a fine frequency offset (a) in dependence of said 
estimated coarse frequency offset (p). 

in this embodiment, the steps of estimating a coarse frequency offset (p) 
and/or of estimating a fine frequency offset, i.e., step a) or b) or both steps a) 
and b), preferably comprise a step of calculating a phase value of said first 
signal (I J(k)| 2 ). 

The step of estimating the frequency offset comprises in one embodiment a 
step of assigning a fine frequency offset value dependent on the value of the 
coarse frequency offset according to the following function: 



e=a 


; if (-0.1)/4£p<;(0.1)/4 
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In an alternative embodiment that is slightly more robust to noise than the 
embodiment just described the step of assigning a fine frequency offset value 
dependent on the value of the coarse frequency offset is performed according 
to the following function: 

e=a ; if (-0.25)/4< p < (0.25)/4 (R1 ) 

s=a ; if a > 0 and (0.25)/4< p < (0.75)/4 (R2) 

6=1 +a ; if a < 0 and (0.25)/4< p < (0.75)/4 (R3) 
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Preferably, after estimation the frequency offset, a step of correcting the input 
signal by the estimated value is performed. 

A further embodiment of the synchronizing method of the invention comprises 
after said step of frequency offset correction a step of estimating the time of 
reception of at least one symbol contained in a received data frame 
(hereinafter referred to as a symbol timing step). 

The symbol timing step preferably comprises a step of generating a 
crosscorrelation signal, which is dependent on the value of the 
crosscorrelation of the corrected input signal with a known reference signal, 
wherein the reference signal is a first section of long preamble symbols 
included in the data frame. In a preferred embodiment, the reference signal is 
32 samples long. 

Another embodiment comprises a step of estimating a reference channel, 
preferably using a step of performing a Fast Fourier Transform of a second 
section of the long preamble symbols included in the data frame. 

According to a fourth aspect of the invention, a synchronizer device is 
provided. The synchronizer device comprises a frame detector according to an 
embodiment of the frame detector of the second aspect of the invention. 

The synchronizer device preferably further comprises a symbol timing unit 
adapted to generate a crosscorrelation signal, which is dependent on the 
value of the crosscorrelation of the corrected input signal with a known 



WO 2004/008706 



13 



PCT/EP2003/007750 



reference signal, wherein the reference signal is a first section of long 
preamble symbols included in the OFDM frame. 

The symbol timing unit comprises in one embodiment a crosscorrelation unit 
with a number of multipliers for complex numbers, and wherein at least one 
multiplier is made of a combination of XNOR-gates, inverter gates and adders. 

To save power, the frame detection unit and the symbol timing unit can in one 
embodiment be enabled or disabled individually. This is preferably 
implemented by means to enable or disable the clock for the individual unit It 
is noted that one clock is used for both units. 



RR1FF DESCRIPTION OF THE DRAWINGS 

In the following, a preferred embodiment of a synchronizer device according to 
the invention will be described with reference to the figures. 

Figure 1 shows the preamble symbols used in the IEEE 802.11a 

standard. 

Figure 2 shows a scheme of a general delayed autocorrelator. 

Figure 3 shows a detailed scheme of a moving average block in the 

autocorrelator of Fig. 2. 

Figure 4 shows a signal at the output of the delayed autocorrelator for 

two possible values for N d keeping N av g = N<j- 

Figure 5 shows a structure of an embodiment of a plateau detector, 

Figure 6 shows a sketch of the signal J d m(k) at the output of the 

differentiator, 

Figure 7 shows a detailed scheme of the several signals involved in the 

peak detection procedure. 
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Figure 8 shows a particular implementation of the peak detector. 



Figure 9 



Figure 14 



Figure 16 



Figure 17 



Figure 1 8 



shows the dependency of the phase of J*(fc) with respect to the 
input carrier frequency offset (normalized) for two particular 
values of N<j 



Figure 10 shows a scheme of a frequency offset estimator used in the 

synchronizer device. 

Figure 1 1 shows the different decision regions for the combination of a 

and p in the frequency offset estimation and correction. 

Figure 12 shows a timing diagram of the whole synchronization 

procedure. 

Figure 13 shows a scheme of an XNOR-based complex multiplier. 



shows possible simplifications of the XNOR-based complex 
multiplier if the input B is known a priori. 



Figure 15 shows a structure of the crosscorrelator used for timing 



estimation. 

shows the square magnitude of the output of the crosscorrelator 
when the reference signal is the portion of the long preamble 
symbols shown in Figure 12. 

shows a general synchronizer scheme with different clock 
domains. Each clock domain is activated when it has to carry 
out some kind of operation. If not, it is disabled in order to 
reduce the power consumption of the whole synchronizer. 

shows a representation of the two operation modes of the 
circular CORDIC algorithm being used in the synchronizer. 
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DESCRIPTION OF PREFERRED EMBODIMENTS 

Figure 1 shows the structure of a preamble 10 of a frame used in the IEEE 
802.1 1a standard. It is composed of ten short preamble symbols referred to as 
ti, t 10 , each having a length of 0.8 us. A short preamble symbol is sampled 
16 times. Furthermore, there are two long preamble symbols 12 and 14 
(ABCD-ABCD) with a double cyclic prefix 16 (CD), thus resulting in the 
structure CD-ABCD-ABCD of the second section of the preamble. The full 
preamble as shown has a duration of 16 [is. 8|js is the duration of the short 
ten preamble symbols, and 8 ps is the duration of the long preamble symbols 
with the cyclic prefix. 

In the following, an embodiment of a frame detector is described in detail. The 
description also gives a clear picture of an embodiment of a method for frame 
detection that is implemented in the frame detector. 

During synchronization the reception of a frame is to be detected. This 
detection is based on the particular periodic structure of the preamble 
symbols. The circuit used for this purpose is sketched in Figure 2, 
representing a scheme of a general delayed autocorrelator 20. It is based on 
an autocorrelation of the input signal with a delayed version of itself. 

The autocorrelator has a delay block 22 introducing a delay of N d clock cycles 
to an input signal y 0 H=(n). In parallel to delay block 22 a conjugator block 24 is 
provided that generates a complex conjugate of the input signal yo/^faj- The 
output signals of the delay block 22 and of the conjugator block 24 represent 
the input signals to a multiplication stage 26. The output of multiplication stage 
26 is fed into a moving average block 28 that will be explained in further detail 
with reference to Fig. 3. The moving average of the input signal of block 28 
consists of the addition of the most recent Navg samples. The output signal J(k) 
of moving average block 28 is fed into a block 30 that provides at its output the 
absolute amount of its input signal. 

The output of the autocorrelation block can be expressed as 
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J(to = X y OFF il-k)- yoFF {J-k-N d ) ( 1 ) 

Two parameters are important here: N d and N avg - The former (N d ) is the length 
of the delay introduced by delay block 22 and the latter (N avg ) is the length of 
the moving average block. 

The selected value for Nd directly depends on the periodicity found in the 
preamble. From Figure 1, several periodicities are possible, i.e. 16, 32, 48, 64 
and 80 samples. Generally, the value for Navg is selected to be the same as 
N d . 

Fig. 3 shows a detailed scheme of the moving average block 28 in the 
autocorrelator. Its structure is like that of in a Finite Impulse Response (FIR) 
filter, but with the advantage that all the coefficients are one. Thus, it is not 
necessary to add all the N av g samples stored inside the register each time 
when a new sample comes in, but only add the new sample and subtract the 
oldest one. 

The structure of the moving average block sketched in Figure 3 comprises 
Navg delay elements, three of which are shown with reference signs 32, 34, and 
36. Each delay element introduces a delay of one clock cycle. Delay element 
32 is the first delay element in the signal flow, delay element 34 the second, 
and delay element 36 the last. Each of the delay elements works with a 
complex sample. The output of the last delay element 36 in the signal flow is 
fed as the subtrahend into a subtracter 38. Subtractor 38 receives a second 
input from an adder 40. Adder 40 adds the current content of a register 42 and 
the output of the first delay element 32. Register 42 contains the last output of 
subtractor 38. 

Fig. 4 shows a signal at the output of the delayed autocorrelator 20 for two 
possible values for N d , keeping Navg = N d . In Figure 4, the signal \J(k)\ 2 is 
represented as a function of sample index, i.e. as a function of time, for N d =16 
(Fig. 4(A)) and N d =64 (Fig. 4(B)). The output J(k) is complex, and so the 
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square magnitude \J(k)\ 2 is represented. It can be observed that \J(k)\ 2 shows 
a region with a constant value {plateau) for a number of samples. In both 
cases N d =16 and N d =64, several regions showing a plateau are found. The 
length and position of the plateau directly depends on N d . The plateau is 
longer and occurs earlier for N d =16. 

For reasons that we will clarify in the next section, when explaining the carrier 
frequency offset estimation, the embodiment described here uses N d =64 and 
N avg =N d . The procedure to detect the frame is based on a plateau detection 
algorithm for the signal | J(k)\ 2 (see Figure 4 (B)). 

Figure 5 shows a structure of an embodiment of a plateau detector 50. The 
plateau detector 50 is separated into a differentiator block 52 and a peak 
detector block 54. Making use of a differentiator, the plateau detector 50 
implements an algorithm that tries to find out the point where the first plateau 
starts. This is done according to the following principle: At the onset of a 
plateau, the function \J(k)\ 2 is not differentiate and an ideal differentiator 
would show a discontinuity. A real differentiator has a limited bandwidth and 
so the discontinuity may not happen. In addition, as we are interested in a 
noise-robust system, this bandwidth has to be as small as possible. As the 
plateau is 32 samples long, we decided to use the differentiator shown in 
Figure 5 for the present embodiment, which is based on a delay line 56 of 
length 32 together with a subtracter 58. 

The peak detection block 54 is divided into two sub-blocks: group peak 
detector 58 and instantaneous peak detector 60. The structure of the peak 
detector block 54 will be explained in detail below with reference to Fig. 8. 

Figure 6 shows a sketch of the signal J^k) at the output of the differentiator 
block 52, when the input \J(k)\ 2 is as in Figure 4 (B) with N d =64. Both signals 
Jd»ff(k) and \J(k)\ 2 are shown in Fig. 6 on the same relative scale as a function 
of clock cycle (i.e., time) k. Interestingly, a peak is found at the output of this 
differentiator 52 at that point where the first plateau in | J(k)\ 2 starts. This is the 
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main reason why a peak detector algorithm is applied at the output of the 
differentiator. 

The autocorrelation block 20 together with the differentiator 52 and the peak 
detector 54 will constantly peer the channel. When the peak detector 54 
5 identifies an absolute maximum, the synchronizer will consider that a new 
frame has arrived and the carrier frequency offset estimator will be activated. 
Because of the noise (thermal, digital), the peak detection will not be a trivial 
task, i.e. a smart peak detection algorithm will be necessary in order to 
distinguish the absolute from the relative maxima. 

10 Fig. 8 shows a more detailed block diagram of a particular implementation of 
the 54 peak detector. 

The instantaneous peak detector 60 is composed of a comparator 70 and a 
counter 72. The present sample J dm (k) coming out from the differentiator 52 is 
compared with the last recorded maximum J max that is stored in a register 74. 
15 As long as the sample J am (k) is bigger than J max , the register 74 storing will 
be updated to contain the new sample J^k) as the latest maximum and the 
counter 72 will be reset 

If •/«<*) is smaller or equal than J max , the counter 72 will be triggered and it will 
increase its count by one. If this situation remains until the counter 72 counts 
through its full range, the instantaneous peak detector 60 will generate a 
signal stating that a relative peak was found inside the counting scope of the 
counter 72. In the implementation of Fig. 8 a 4-bit counter is used, which 
makes a counting scope of 16. 



20 
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The group peak detector 58 is used to detect falling slopes in J^k), and its 
main component is also a comparison block. If the group peak detector finds a 
falling edge at the same time as the instantaneous peak detector finds a 
relative peak, this means that the detected peak is actually an absolute peak. 

In the group peak detector, the input signal J dm (k) is accumulated in groups of 
six samples (6-tuples) in an accumulator 78 and the present group is 
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compared with the previous one. If it is smaller, it means that the falling slope 
has started. 

Figure 7 shows a detailed scheme of the several signals involved in the peak 
detection procedure. A signal FS (dashed line) is given by the group peak 
detector (falling slope detector). The signal FS is zero until shortly after a peak 
has occured in the signal J^k) (full line),which is shown here again as in Fig. 
6. At this point it switches to the value 1. When J^k) is detected to increase 
again, the signal FS switches back to zero. A signal C indicates the value of 
the counter 72. The counter signal C exhibits an increasing slope after the 
detection of a peak in the signal JdrnW until the full range of the counter is 
reached. After that, the signal C jumps back to zero. 

A signal FD (fat full line) represents the point in time where a decision on the 
existence of a peak is taken. FD is zero and exhibits a short peak at the point 
the counter signal 72 reaches its full range and the signal FS indicates a 
falling slope. 

We have to mention that a frame will be detected 16 samples after its actual 
starting point, i.e., the detection will occur in the middle of the first plateau in 
\J(k)\ 2 (dotted line in Fig. 7). However, this fact does not pose any problem, 
and the reason for the same will be given in the next section, where we 
explain the carrier frequency offset estimation. 

In the following, a device and method for carrier frequency offset estimation 
and correction are described with reference to Figs. 9 to 1 1 . 

During the RF down-conversion, the local oscillator (LO) normally is not 
exactly tuned to the expected frequency, but it will show some offset. As the 
different sub-carriers in an OFDM system overlap, any frequency offset in the 
receiver side may lead to significant Inter-Carrier-Interference (ICI). Once the 
synchronizer has been fired after frame detection, the next operation will be 
the estimation of this frequency offset Here we recover the expression given 
in equation (1) for the autocorrelation J(k), considering that 
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yoFF(?0 = y( n )- e ( 2 > 

where yonKn) is the signal affected by a normalized frequency offset 4 This 
normalization is with respect to the channel spacing Af \n the OFDM signal, 
which is 312.5 kHz in the IEEE 802.11a standard. The parameter f s is the 
inverse of the FFT time (fe = 20 MHz in the standard under consideration). The 
signal may then be rewritten as 

J{k) =e fs • £y\l -k) y(l -k -N d ) (3) 

If y(n) is a periodic signal with a period of N d samples, i.e. y(n) = y(n-N d ), then 
(3) can be simplified to 



J(k) =e fs - V kf (4) 



From equation (4) it can be seen that the phase of J(/c) is only due to f e , and so 
f z could be found as follows 

f * = o m ^ ^(rw) (5) 

However, there are several factors which destroy the periodicity, making y(n) * 
y(n-N d ). The most important ones are the AGC settling time and the channel 
impulse response. Another factor is noise, but its effect can be largely 
compensated by the averaging N avg in equation (4). In addition, if Navg is a 
multiple of the minimum periodicity in the preambles (16 samples in this 
standard), \J(k)\ 2 shows a plateau in the region where the phase of J(k) only 
depends on the carrier frequency offset. 



Considering what has been stated above, the frequency offset is in the 
present embodiment estimated after calculation of the phase of J(k) t at that 
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point k where the peak detector established the beginning of the frame, 
because this point falls exactly in the middle of the plateau in \J(k)\ 2 . 

Nevertheless, there is a limitation in the frequency offset estimation, which can 
be obtained by calculating the phase in the expression of equation (4), i.e. 

-7i <2nf z ^-N d <% (6) 

fs 

The ratio A/7 f s is 1/64 in the IEEE 802.1 1a standard. From (6) it can be seen 
that the range of possible estimated values for f c will only depend on the 
selected delay N d in the autocorrelator. 

Figure 9 shows the dependency of the phase of J*(fc) (conjugate value of J(k)) 
with respect to the input carrier frequency offset f 6 (normalized) for two 
particular values of N d . Afterwards these two values will be used to obtain two 
estimations for the frequency offset, a and p. 

In our implementation, f 6 is supposed to have its value in the range ±1.5, i.e. a 
frequency offset of ±468.75 kHz in an oscillator working in the 5 GHz ISM 
band. If the oscillator is tuned at 5.6 GHz, this value of expected offset is 
about 85 ppm (parts-per-million). 

We will estimate f z decomposing its value into two components a and p, and 
the actual estimation s will be a function of these two components, i.e. 
e=f(a,p). The value of a is restricted to be in the range ±0.5, and so is called 
the fine frequency offset The parameter p is called coarse frequency offset, 
and it will be in the range ±2.0 (see Figure 9). Our aim is to get proper 
estimations for these two parameters and combine them to finally obtain e. 
With the presented scheme used for frame detection it is possible to derive the 
value for a. 

Figure 10 shows an embodiment of a device 80 for frequency offset estimation 
and correction, which is based on the principles outlined above. In this device 
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two delayed autocorrelators 20 and 82 are used. Both are used for the 
frequency offset estimation, but only autocorrelator 20 of Fig. 1 (with N d =64) is 
also used for the frame detection. Autocorrelator 82 uses the signal after a 
delay of N d =16 . 

We may estimate the value for a by calculating the phase an output of this 
autocorrelator 20, J F (fc), which is the signal obtained before calculation its 
square amount The estimation of p can be obtained using autocorrelator 82 
with N d =16 and N^Nd and calculating also the phase of its output J c (k). 

The blocks 84 and 86 used to obtain a and p perform an arctangent 
calculation. This complex mathematical operation can be efficiently realized by 
using the CORDIC algorithm working in the vectoring mode. Although two 
arctangent blocks have been shown in Figure 10, only one is necessary in the 
actual implementation. After the frame has been detected by the peak detector 
50, the two samples J c {k) and Jf{k) will be stored in a register (not shown). 
The arctangent block 84, 86 will first calculate a from Mk) and afterwards p 
from Jc(k). More details on the CORDIC algorithm and its implementation can 
be found in the German Patent 101 64 462.0. 

The correction of the frequency offset is carried out considering the signal 
model given in eq. (2). To obtain the original signal y(n), the input signal 
yoFF(n) has to be multiplied by a phasor, which is the complex conjugate of the 
one found in eq. (2). This operation will be carried out by a Numerically 
Controlled Oscillator (NCO) 122, cf. Fig. 17, which is implemented using again 
the CORDIC algorithm, this time operating in the rotational mode. 

The dependency of a and p vs. f c was sketched in Figure 9. There it can be 
seen that this dependency is not completely lineal, but shows some 
discontinuities. The reason for these discontinuities are the phase leaps from 
+71 to -7T at the complex exponential in eq. (4). Thus, the final value for s 
cannot be a linear combination of the two estimations a and p. Thus one has 
to find a suitable way to combine a and p in order to determine e. 
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From Figure 9 it can also be seen that p has no discontinuities throughout the 
entire range of expected values for f e (±1.5). Nevertheless, p will be much 
more noisy than a since the moving average considers only 16 samples (see 
Figure 10). For this reason, p will provide a coarse approximation of the 
frequency offset, which will be further refined using a. 

The selected function s=f(a,p) is as follows 

s=a ; if (-0. 1 )/4<. p <; (0 . 1 )/4 (R1 ) 

s=a ; if a £ 0 and (0. 1 )IA< p < (0.9)/4 (R2) 

e=1 +ct ; if a < 0 and (0. 1 )/4< p < {0.9)1 A (R3) 

s=1 +a ; if p ^ (0.9)/4 (R4) 

s=a ; if a < 0 and (-0.9)/4< p < (-0.1 )/4 (R5) 

E =_1 + a ; if a > 0 and (-0.9)/4< p < (-0.1 )/4 (R6) 
e=-1+a ; if p <: (-0.9)/4 (R7) (7) 

In an alternative embodiment, the selected function is as follows: 



8=a 


; if (-0.25)/4< p <, (0.25)/4 




(R1) 


s=a 


; if a £ 0 and (0.25)/4< p < 


(0.75)/4 


(R2) 


e=1+a 


; if a < 0 and (0.25)/4< p < 


(0.75)/4 


(R3) 


8=1 +<x 


; if p > (0.75)/4 




(R4) 


e=— 1 +a 


; if a > 0 and (-0.75)/4< p 


< (-0.25)/4 


(R5) 


8=a 


; if a < 0 and (-0.75)/4< p 


< (-0.25)74 


(R6) 


• 

s=— 1 +a 


; if p ^ (-0.75)/4 




(R7) 



(8) 
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Figures 11a) and b) show the different decision regions for the combination of 
a and p in the frequency offset estimation and correction. In this expression, 
certain regions R1 to R7 have been defined for different values of a and p, 
which are also indicated in Figs. 11a) and b). The estimation e of the 
frequency offset will be assigned in combination block 88 depending on these 
particular regions. 

In the following paragraphs, an example of a method and device for symbol 
timing estimation will be described. 

Unlike to what was done during earner frequency offset estimation, where the 
periodicity of the short preamble symbols was the main feature used for the 
estimation, the symbol timing estimation will be obtained by exploiting the 
direct knowledge of the long preamble symbols 12 and 14. After the frame is 
detected, the synchronizer knows approximately which samples of the 
preamble have entered the delay line of the autocorrelator 20, but this 
knowledge is not enough for further processing and has to be refined. 

The main block of the symbol timing estimator is a crosscorrelator, cf. Fig. 17. 
Its main purpose is to compare the input frame with a reference signal, which 
is directly obtained from the long preamble symbol. The crosscorrelation can 
only be applied once the samples of the input frame have been corrected for 
the frequency offset by the NCO. 

Figure 12 shows a timing diagram of the whole synchronization procedure. 
This is important to show at which point the input frame has been already 
corrected for the frequency offset. Only when this correction is done to the 
input frame, the crosscorrelator can be used for the timing estimation. The 
figure shows also the portion of the long preamble symbols used as reference 
signal in the crosscorrelator. The "piece" of the long preamble symbols 
selected as the crosscorrelator reference CrefK^) is shown in Figure 12. The 
reference has a length of 32 complex samples, which is the shortest possible 
length for this reference in order to obtain appropriate results after 
crosscorrelation. 



WO 2004/008706 



-25- 



PCT/EP2003/007750 



Considering the standard IEEE 802.11a, the reference is as follows: 
c RE f(0..31) 

{0.1563, -O.0051-y0.1203, 0.0397-70.1112, 0.0968+/0.0828, 
0.021 1+y'0.0279, 0.0598-j0.0877, -0.1151-/3.0552, -0.0383-y0.1062, 0.0975- 
y0.0259, 0.0533+/0.0041, 0.0010-y0.1150, -0.1368-y0.0474, 0.0245-y0.0585, 
0.0587-y0.01 49, -0.0225+yO. 1 607, 0. 1 1 92-y0.0041 , 0.0625-y0.0625, 
0.0369+y0.0983, -0.0572+/0.0393, -0.1313+/0.0652, 0.0822+/0.0924, 
0.0696+/0.01 41, -0.0603+/0.081 3, -0.0565-/3.0218, -0.0350-/3.1509, 
-0.1219-y0.0166, -0.1273-y0.0205, 0.0751-/3.0740, -0.0028+/0.0538, 
-0.091 9+/0.1 151, 0.091 7+yO. 1059, 0.0123+/0.0976}, where /=V-1. 

From the implementation point of view, the complex crosscorrelator is usually 
a "weak* point in modern communication circuit designs because of its 
computation complexity (they require a large number of complex multipliers) 
and subsequent need for large silicon area. Having this in mind, in this 
implementation we used a simplified scheme for the crosscorrelator, based on 
simple XNOR 1-bit multipliers (Figure 13), that substitute the commonly used 
complex multipliers. 

Figure 13 shows a scheme of an XNOR-based complex multiplier. The 
multiplier comprises four XNOR-gates 90 to 96 The two signals to be 
multiplied are A=A real +/Ama g and B^B^t/B^, with j=V-1. Both real and 
imaginary parts are only one bit long, i.e. '1' if the signal is positive and '0' if it 
is negative. Thus, instead of multiplying N-b'rt complex numbers, the XNOR 
multiplier of Fig. 13 performs only the multiplication of the sign bits of the 
complex input values. 

In the following, the use of XNOR multipliers for complex multiplication is 

explained in more detail: 

The multiplication of two complex numbers A = A^ai + j Amag. and B = B^ + j 
Bjmag, where j = sqrt(-1), results in C = A x B, which is also complex with C^, = 

( Area! X B^, ) - ( Amag X B irnag )and C ima g = ( A imag X B^l ) + ( Area! X B lmag ) 
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This means that in principle four multipliers 90 to 96 and two adders 98 and 
100 are necessary. 

In this case, the multiplication of Area! x B rea i may comprise the following 
possible combinations: 



Areai 


Breal 


Afeal X B rea j 


0 


0 


1 


0 


1 


0 


1 


0 


0 


1 


1 


1 



This table is equivalent to a XNOR gate with inputs Area! and B rea) . i.e., this gate 
produces a 'O' at the output when the number of T inputs is odd. 



The previous operation has to be repeated 4 times in a complex multiplication. 
As mentioned before, in the present embodiment we represent Areai, Aimag, Breai 
and Bjmag with a single bit, i.e., if they are positive we use a '1' whereas when 
negative we use a '0\ Let dAreai, dAjmag. dBreai, dB^, dCreai and dCimag be the 
representation of Areai, Amag. Breai, Bimag, Creai and Cimag by a single bit, 
respectively. Now the value of dC rea i and dCimag is: 

dCreai = XNOR(dAreal, dBreai) -XNOR(dAimag, dBimag) 

dCimag = XNOR(dAimag, dBreai) + XNOR(dAreal, dBimag) 

An important point to be noted is that the original signals Areai, Breai, Aim ag and 
Bimag may take positive or negative values, but they have no DC level, i.e., on 
the average they are positive and negative the same amount of time. When 
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converting them into T or '0' a DC level is introduced which is going to be 0.5. 
After the XNOR operation, the non-zero DC offset still will be present. 
Nevertheless this is not a problem when calculating dC rea i because the DC 
levels get compensated due to the subtraction operation. But for the case of 
dCimag, both DC levels are accumulated. For this reason in Figure 13 of the 
there is a DC level subtraction 101. 

In addition, a further simplification in the structure of these multipliers is 
possible if one of the inputs is fixed and is known beforehand. The simplified 
structure of multipliers based on this knowledge is shown in Fig. 14 (A) to Fig. 
14(D). Figure 14 shows possible simplifications of the XNOR-based complex 
multiplier if the input B is known a priori. In this simplification, the XNOR gates 
may be replaced by NOT gates, which require a less number of transistors. 

If the value of B is known beforehand, only two rows in the previous logic table 
are to be considered. There are four possible cases: dB rea i=0, dB imag =0; 
dBreai=1, dBi mag =1; dB rea p1 , dBon^O; dB rea i=0, dB imag =1. In this case the XNOR 
gates are simplified to NOT gates, which require a smaller number of 
transistors. 

Considering an example dA^ x dB rea i knowing already that dBrea) = '0', 
according to the previous table, the result of the multiplication is the inverse of 
the input. This means that no XNOR gate is needed but just a NOT gate, such 
as for example NOT gate 102 in Fig. 14 (A). On the other hand, if dB^ is 
known to be '1\ the previous table says that the result of the multiplication is 
directly the input value dA^. Here, no gate is required at all, cf. for example 
Fig. 14 (B). The multipliers of Figs 14 (C) and (D) are built using similar 
considerations. 

The final results dC^ and dC imag are represented by 2 bits each. For 
comparison, if A^, Breaj. Amag and B^g were coded by for instance 16 bits, 
C rea j and C imag would comprise 32 bits if a full multiplication would have been 
used. 
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The final structure of a crosscorrelator 108 based on these principles is shown 
in Figure 15. It is composed of 32 delay elements 110 and 32 multipliers 112. 
The structure of each multiplier 1 12 is chosen in dependence on the reference 
signal Cref^I^.O)*, where we already considered the reference to be complex 
conjugated, hard-limited and order-reversed. The final reference looks as 
follows 

creK31,..,0)*={1. 1, 0, 0, 1+y. y, y, y, y. 0, 1, 1, 0, 0, 1, 1+y, 1+y, 0 f 1+y, 1+y,y, 
1+y, 1. 1+/. II 1+7. 1. 1. 1+y. 7. 1>. where H-1. 

The multiplication signals are added in and adder 114. The square magnitude 
of the output of adder 1 14 is calculated in a block 116. 

The output of the crosscorrelator 108 is shown in Figure 16 as a function of 
the numbers of clock cycles n, i.e., as a function of time. Two major peaks P1 
and P2 become visible at instants and aw Both peaks P1 and P2 will 
occur when the portions AB in the long preamble symbols (see Figure 12) are 
inside the crosscorrelator 108. For our purpose it is sufficient to detect the first 
peak P1 by setting a certain threshold at the output of the crosscorrelator 108. 
The 64 samples coming immediately after this first peak will be fed into an 
FFT block (CDAB fields marked with "First FFT symbol 0 in Figure 12) in order 
to get the reference channel estimation. 

The following description concerns the step of reference channel extraction. 
The reference channel estimation is used by the channel estimator in order to 
correct for the filtering due to the transmission channel. The reference channel 
will be obtained by calculating the FFT of the CDAB fields in Figure 1 2. This 
operation is performed by an FFT block 120 (cf. Fig. 17) and may start 
immediately after the crosscorrelator 108 decides the initial timing n^i. 
Nevertheless, the resulting FFT calculation has to be multiplied by the 
sequence (-1) m , m being the frequency variable. 

In the IEEE 802.11a standard, the actual long preamble symbols are defined 
as ABCD, i.e. a cyclic delay of 32 samples into a sequence of 64 samples is 
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introduced. After performing the FFT, any time delay is seen as a linear 
phase, which in this specific case is reduced to the sequence exp{- 
- /27c(32/64)m}=exp{-^im}=(-1)' n . In order to compensate for this linear phase, 
the FFT output has to be multiplied exactly by this sequence. 

The phase compensation mentioned above is only necessary when 
calculating the reference channel estimation. For the data symbols coming 
after the preamble no phase correction will be necessary. 

Figure 17 shows a general synchronizer scheme with different clock domains. 
Each clock domain is activated when it has to carry out some kind of 
operation. If not, it is disabled in order to reduce the power consumption of the 
whole synchronizer. The clock domains are indicated by different hatches in 
Fig. 17. A first clock domain comprises the autororrelators 20 and 82 (cf. also 
Fig. 10), peak detector 50, and the arctan-block 84/86. A second clock domain 
comprises NCO 122 and FFT block 120. A third clock domain includes 
crosscorrelator 108. 

There are two further operations to be performed inside the FFT block 120 in 
Figure 17. As the synchronizer already knows the timing of the input samples, 
it can detect the end of the preamble symbols and the beginning of the data 
symbols. The data symbols will go through a cyclic prefix extraction block (not 
shown) prior to the FFT calculation. The cyclic prefix extraction is mainly 
formed by a counter. For each data symbol, 80 samples are expected. The 
first 16 will be discarded and the last 64 will be fed into the FFT block 120. The 
cyclic prefix is inserted in the OFDM symbols in order to prevent the Inter- 
SymboWnterference (ISl) caused by the channel filtering. 

The last operation included inside the FFT block 120 is the channel 
reordering. Immediately after the FFT calculation, the output samples are 
delivered in the serial form according to the natural order. For further 
processing, this order has to be changed. Specifically, the data after FFT is 
given in the following order D 0( D 1f D 2 , .... D 31l D-32, D-31. D.30, D.l where 
the subindex indicates the corresponding sub-channel. But the samples D 0l 
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D 27 , D 28 , D 29l D30, D31, D^, D.31, D.30, D.29. D.28 and D. 27 (11 samples in total) 
carry no information and are directly discarded. The remaining 52 samples {D k 
/ ke[1,26]u[-26,-1]} are provided at the output of the FFT block 120 in the 
following order: D- 21 , D. 7f D 7l D 2il D. 26 , D. 25 , D.^, D_ 20l D. 19> D. 18 , D-s, 
D^, D-5, .... D. 1f D 1f D 5l D 6 , D 8 , .... D 18l D 19l D^, D^, .... D^, D 26 . 

The actual implementation of the FFT processor 120 itself is not going to be 
explained in detail here. The main idea behind this implementation has been a 
mathematical manipulation of the definition of the FFT in order to convert the 
64-point FFT into a 2-<Jimensional 8x8-point FFT. For more details see the 
German Patent application DE 100 62 759.5. 

Figure 18 shows a representation of the two operation modes of the circular 
CORDIC algorithm being used in the synchronizer. 

The CORDIC processor is a part of the synchronizer. The processor performs 
the circular CORDIC algorithm in both of its operation modes, viz. rotational 
and vectoring. While the rotation mode of operation of the CORDIC enables 
us to compute the multiplication of any quantity with a phasor 
exp{/cp}=cos(9)+^in(q>), the vectoring mode can be used for computing the 
magnitude and the phase angle of a complex value. The operation principle of 
these two modes is shown in Figure 18. 

The rotational mode is useful to implement the NCO 122 whereas, the 
vectoring mode is useful for arctangent calculations in block 84,86. However, 
for the realization of the synchronizer, these two modes of operation of the 
CORDIC are independently utilized in two different phases. At first, the phases 
of Mk) and J c {k) (see Figure 10) are computed (vectoring mode of operation) 
to obtain a and p, respectively. After calculation of s by combining a and p, 
this phase angle is used to obtain a phase correction for the incoming data 
symbols applying a numerically controlled oscillator (NCO) 122 in the rotation 
mode of operation. The phase angle evaluation takes place only once at the 
beginning of the frame (after frame detection) whereas, for the rest of the 
frame the NCO operation is carried out. Thus, in our consideration, it is more 
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pragmatic to separate the two modes of CORDIC operation and realize them 
in the form of two separate modules. This separation of operations opens up 
the possibility for applying clock gating to save the power corresponding to the 
NCO 122 while the computation of the phase angle is carried on and vice 
versa. On the other hand, this separation results in a massive reduction of 
control hardware as no reuse of the component takes place that subsequently 
eliminates any feedback path. Another advantage of separating these two 
functionalities is that, in this case, it is possible to implement each of them in a 
much simplified and efficient manner. 

When using the rotational CORDIC as an NCO, the input signal is multiplied 
by a phasor with the form exp{j(27t/64)-e-/7}, where e is the estimated 
normalized carrier frequency offset (see Figure 10) and n is the time variable. 
Thus, the variable q> in Figure 18.a depends on n, and it is updated at each 
clock cycle. This is done by adding a phase accumulator to the input <p in the 
CORDIC processor. 

We- have to mention here that the value of e must not be sign-reversed in 
order to apply a correct phase correction, because the result coming from the 
arctangent calculation already considers this sign. This can be easily seen in 
expression (4), where by definition, the phase contribution in J(k) already 
contains the -ve sign. 

The actual implementation of the rotational and vectoring CORDIC processors 
is described in more detail in the German Patent 101 64 462.0. 

In summary, the the proposed synchronizer has the following advantages: 

The proposed synchronizer has been designed as a power efficient system. 
This has been achieved in one side by optimizing each block independently 
and on the other side by dividing the whole synchronizer structure into 
different clock domains (see Figure 17). 

A clock demain separation helps to save power in the sense that only certain 
regions of the system are activated for operation, while others are not, by 
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applying the clock gating. Thus, as shown in Figure 17, three clock domains 
are used here. The blocks belonging to the clock domain #1 peer the channel 
trying to detect an incoming frame through the peak detector algorithm. When 
this is true, the arctangent calculation is triggered, but it will operate only on 
5 two samples, thus obtaining two single values for a and 0. Afterwards this 
clock domain is disabled. The combination of a and p to finally obtain e can be 
done using combinatorial logic, thus requiring no clock at all. 

Disabling the clock domain #1 , it is possible to save the power that otherwise 
would be consumed by the huge delay line at the input of the autocorrelator as 
10 well as the FIR structures of the moving average blocks. 

Once the value for s is available, the NCO 122 will start its operation until the 
end of the frame. The crosscorrelator 108 will be activated at the same time, 
but it will operate only until a peak is found at its output being afterwards 
disabled. This is achieved by assigning a particular clock domain to the 
1 5 crosscorrelator 1 08 (clock domain #3). 

The FFT block 120 is activated after a peak is detected at the output of the 
crosscorrelator and will operate like the NCO 122, until the end of the 
incoming frame, for that reason both the NCO 120 and the FFT blocks 120 
belong to the same clock domain #2. 

20 Apart from the power saving in the system level by applying clock gating, even 
more power saving is achieved by optimising each single block in the 
synchronizer as described above. 

Thus, the frame detection algorithm is based on the knowledge of the ideal 
shape of the signal MfCOI 2 - A simple and robust peak detection algorithm is 
25 applied to this signal in order to detect an incoming frame. The arctangent 
calculation as well as the NCO 122 are based on the CORDIC algorithm. The 
CORDIC processors designed for these purposes have been optimised to 
reach the final angle in an adaptive way and thereby executing a minimum 
number of iteration steps. Furthermore, the crosscorrelator has been simplified 
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to use XNOR-based complex multipliers instead of the normal complex 
multipliers and the reference signal therein has been shorten as much as 
possible in order to obtain valid results. Last but not least, the FFT processor 
120 has been also optimised by using a new architecture, which requires no 
5 coefficient storing or complex multiplier. 

The whole structure introduces a latency of 3.9 jis counted from the instant 
when the frame is detected. This value is less than one OFDM symbol period 
(4 |xs), which means that no extra storage of the input samples has to be done 
inside the synchronizer. 

10 In summary, there are the following inventions comprised in the above 
description: 

The algorithm used for the frame detection, making use of a simplified 
differentiator to obtain an absolute maximum in the differentiated signal at that 
point where the first plateau in Mk) starts (output of the autocorrelator with 
15 N d =64); 

The design of the peak detector to obtain the position of the absolute 
maximum in the differentiated signal, dividing the problem into relative peak 
detection and falling edge detection; 

The way to combine the two frequency offsets a and p to finally obtain e; 

20 The use of a 32-sample long reference signal in the crosscorrelator for timing 
estimation; 

The use of our simplified XNOR-based crosscorrelator, and the simplifications 
therein based on the knowledge of the reference; 

The use of our particular solution for the CORDIC algorithm in the vectoring 
25 mode for arctangent calculation; 

The use of our particular solution for the CORDIC algorithm as NCO for the 
frequency offset correction; 
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The hardware structuring of the whole synchronizer, allowing a very simple 
control mechanism and the separation of this structure into different clock 
domains, each one being activated only to perform its operation and 
deactivated afterwards. 

5 The following is a table relating terms used in the present application to those 
used in European patent application 02 090 257.3, whose priority is claimed, 
with the same meaning. 



Priority application 


This application 


third signal 


autocorrelation signal 


fourth signal 


second siqnal 


fifth siqnal 


differentiator signal 


sixth signal 


instantaneous peak detection signal 


seventh siqnal 


group peak detection signal 


eiqhth signal 


maximum detection signal 


first predetermined number N d 
(claim 2) 


third predetermined number N d 


second predetermined number N aV g 
(claim 3) 


fourth predetermined number Navg 


third predetermined number 
(claim 8) 


first predetermined number 


fourth predetermined number 
(claim 14) 


predetermined accumulation number 


predetermined range of sampling 

periods 

(claim 7) 


second predetermined number of 
sampling periods 


transforming step (claims 1 and 2) 


step b), autocorrelation step 
(claimD 



